less than the number of groups.
The denominator degrees of freedom: This number is designated as df_D or df_2, which is the total number of observations minus the number of groups.
The p value can be calculated from the values of F, df_N, and df_D, and the software performs this
calculation for you. If the p value from the ANOVA is statistically significant — less than 0.05 or your
chosen α level — then you can conclude that the group means are not all equal and you can reject the
null hypothesis. Technically, what that means is that at least one mean was so far away from another mean that it made the F statistic come out much larger than 1, causing the p value to be statistically
significant.
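To see how F, df_N, and df_D combine into a p value, here is a minimal sketch in Python using SciPy. The fasting glucose numbers for the three marital groups are invented purely for illustration.

from scipy import stats

# Hypothetical fasting glucose values for the three marital groups
m   = [92, 99, 105, 110, 97, 101]    # group M
nm  = [88, 94, 90, 102, 96, 93]      # group NM
oth = [104, 112, 108, 99, 115, 107]  # group OTH

# One-way ANOVA: returns the F statistic and its p value
f_stat, p_value = stats.f_oneway(m, nm, oth)

# The same p value can be recomputed from F and the two degrees of freedom:
# numerator df = number of groups - 1; denominator df = total N - number of groups
df_n = 3 - 1
df_d = (len(m) + len(nm) + len(oth)) - 3
p_manual = stats.f.sf(f_stat, df_n, df_d)  # right-tail area of the F distribution

print(f"F = {f_stat:.2f}, df_N = {df_n}, df_D = {df_d}, p = {p_value:.4f} (recomputed: {p_manual:.4f})")

The two p values agree because f_oneway simply computes F and then looks up its right-tail area on the F distribution with those two degrees of freedom.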
Picking through post-hoc tests
Suppose that the ANOVA is not statistically significant (meaning the p value was larger than 0.05). In that case, there is no point in doing any t tests, because the ANOVA found no evidence that any of the means differ. But if the ANOVA
is statistically significant, we are left with the question: Which group means are higher or lower than
others? Answering that question requires us to do post-hoc tests, which are t tests done after an
ANOVA (post hoc is Latin for “after this”).
Although using post-hoc tests can be helpful, controlling Type I error is not that easy in reality. There can be issues with the data that may make you distrust the results of your post-hoc tests, such as having too many levels in the grouping variable you are testing in your ANOVA, or having one or more of the levels with
very few participants (so the results are unstable). Still, if you have a statistically significant ANOVA,
you should do post-hoc t tests, just so you know the answer to the question stated earlier.
It’s okay to do these post-hoc tests; you just have to take a penalty. A penalty is where you
deliberately make something harder for yourself in statistics. In this case, we take a penalty by
making it deliberately harder to conclude that a p value from a t test is statistically significant. We do
that by adjusting the α to be lower than 0.05. How much we adjust it depends on the post-hoc test
we choose.
The Bonferroni adjustment uses this calculation to determine the new, lower alpha: α/N, where N is the number of post-hoc comparisons you plan to make. As you can tell, the Bonferroni adjustment is easy to do manually! In the case of our three marital groups (M, NM, and OTH), there are three pairwise comparisons, so our adjusted Bonferroni α would be 0.05/3, which is about 0.017. This means that for a post-hoc t test of average fasting glucose between two of the three marital groups, the p value would not be interpreted as significant unless it was less than 0.017 (which is a tougher criterion than only having to be less than 0.05). Even though the Bonferroni adjustment is easy to do by hand, most analysts use statistical packages for these calculations, so it is not used very often in practice.
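As a rough sketch of how the penalty plays out in software, here are Bonferroni-adjusted post-hoc t tests in Python with SciPy, using the same hypothetical glucose lists as the earlier ANOVA sketch.

from itertools import combinations
from scipy import stats

# Same hypothetical fasting glucose values as in the ANOVA sketch
groups = {
    "M":   [92, 99, 105, 110, 97, 101],
    "NM":  [88, 94, 90, 102, 96, 93],
    "OTH": [104, 112, 108, 99, 115, 107],
}

pairs = list(combinations(groups, 2))  # the three pairwise comparisons
alpha_adjusted = 0.05 / len(pairs)     # Bonferroni: alpha / number of comparisons

for name_a, name_b in pairs:
    t_stat, p_value = stats.ttest_ind(groups[name_a], groups[name_b])
    verdict = "significant" if p_value < alpha_adjusted else "not significant"
    print(f"{name_a} vs {name_b}: t = {t_stat:.2f}, p = {p_value:.4f} -> {verdict}")

Each comparison is only declared significant if its p value beats the tougher 0.05/3 (about 0.017) threshold rather than the usual 0.05.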
Tukey’s HSD (“honestly” significant difference) test adjusts α in a different way than
Bonferroni. It is intended to be used when there are equally sized groups at each level of the grouping variable (also called balanced groups).
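If you would rather let the software handle the adjustment, SciPy (version 1.8 or later) provides a tukey_hsd function; this sketch again uses the hypothetical glucose lists from the ANOVA sketch.

from scipy import stats

# Same hypothetical fasting glucose values as in the ANOVA sketch
m   = [92, 99, 105, 110, 97, 101]
nm  = [88, 94, 90, 102, 96, 93]
oth = [104, 112, 108, 99, 115, 107]

result = stats.tukey_hsd(m, nm, oth)
print(result)         # table of pairwise mean differences, adjusted p values, and confidence intervals
print(result.pvalue)  # the adjusted p values as a matrix

Because the p values come out already adjusted for the multiple comparisons, you compare them straight against your original α of 0.05 rather than a lowered one.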
The Tukey-Kramer test is a generalization of the original Tukey's HSD test designed to handle
different-sized (also called unbalanced) groups. Since Tukey-Kramer also handles balanced